Optimizing I/O Costs of Multi-dimensional Queries Using Bitmap Indices
Abstract
Bitmap indices are efficient data structures for processing complex, multi-dimensional queries in data warehouse applications and scientific data analysis. For high-cardinality attributes, a common approach is to build bitmap indices with binning. This technique partitions the attribute values into a number of ranges, called bins, and uses bitmap vectors to represent bins (attribute ranges) rather than distinct values. In order to yield exact query answers, parts of the original data values have to be read from disk for checking against the query constraint. This process is referred to as candidate check and usually dominates the total query processing time. In this paper we study several strategies for optimizing the candidate check cost for multi-dimensional queries. We present an efficient candidate check algorithm based on attribute value distribution, query distribution as well as query selectivity with respect to each dimension. We also show that re-ordering the dimensions during query evaluation can be used to reduce I/O costs. We tested our algorithm on data with various attribute value distributions and query distributions. Our approach shows a significant improvement over traditional binning strategies for
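The binning and candidate-check mechanism described in the abstract can be illustrated with a minimal single-attribute sketch. This is not the paper's algorithm: the function names `build_binned_index` and `range_query`, the use of Python integers as bit vectors, and the half-open bin convention are all assumptions made for illustration. Bins fully covered by the query are answered from the bitmaps alone, while rows in partially covered (edge) bins require reading the raw value, which is counted here as a stand-in for I/O cost.

```python
from bisect import bisect_right


def build_binned_index(values, boundaries):
    """Build one bitmap per bin; bin i covers [boundaries[i], boundaries[i+1]).

    Each bitmap is a Python int whose bit r is set when row r falls in the bin.
    Assumes every value lies in [boundaries[0], boundaries[-1]).
    """
    bitmaps = [0] * (len(boundaries) - 1)
    for row, v in enumerate(values):
        b = bisect_right(boundaries, v) - 1
        bitmaps[b] |= 1 << row
    return bitmaps


def range_query(values, boundaries, bitmaps, lo, hi):
    """Answer lo <= v < hi; return (matching row ids, raw values read)."""
    hits = 0        # rows known to match from bitmaps alone
    candidates = 0  # rows in edge bins that need a candidate check
    for i, bm in enumerate(bitmaps):
        b_lo, b_hi = boundaries[i], boundaries[i + 1]
        if b_hi <= lo or b_lo >= hi:
            continue              # bin is disjoint from the query range
        if lo <= b_lo and b_hi <= hi:
            hits |= bm            # bin lies entirely inside the query range
        else:
            candidates |= bm      # bin only partially overlaps the range

    # Candidate check: read the raw value of every candidate row.
    reads = 0
    row = 0
    remaining = candidates
    while remaining:
        if remaining & 1:
            reads += 1
            if lo <= values[row] < hi:
                hits |= 1 << row
        remaining >>= 1
        row += 1

    rows = [r for r in range(len(values)) if (hits >> r) & 1]
    return rows, reads
```

In this sketch a query whose endpoints coincide with bin boundaries needs no raw reads at all, whereas a query that cuts through a bin pays one read per row in that bin. This is why the placement of bin boundaries relative to the value and query distributions affects the candidate check cost.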
Similar resources
Towards Optimal Multi-Dimensional Query Processing with Bitmap Indices
Bitmap indices have been widely used in scientific applications and commercial systems for processing complex, multi-dimensional queries where traditional tree-based indices would not work efficiently. This paper studies strategies for minimizing the access costs for processing multi-dimensional queries using bitmap indices with binning. Innovative features of our algorithm include (a) optimall...
Array-Based Evaluation of Multi-Dimensional Queries in Object-Relational Database Systems
Since multi-dimensional arrays are a natural data structure for supporting multi-dimensional queries, and object-relational database systems support multi-dimensional array ADTs, it is natural to ask if a multi-dimensional array-based ADT can be used to improve O/R DBMS performance on multi-dimensional queries. As an initial step toward answering this question, we have implemented a multi-dimen...
Design and Implementation of Bitmap Indices for Scientific Data
Bitmap indices are efficient multi-dimensional index data structures for handling complex ad hoc queries in read-mostly environments. They have been implemented in several commercial database systems but are only well suited for discrete attribute values, which are very common in typical business applications. However, many scientific applications usually operate on floating point numbers and cann...
Bitmap Indices for Fast End-User Physics Analysis in ROOT
Most physics analysis jobs involve multiple selection steps on the input data. These selection steps are called cuts or queries. A common strategy to implement these queries is to read all input data from files and then process the queries in memory. In many applications the number of variables used to define these queries is a relatively small portion of the overall data set; therefore reading al...
Publication date: 2005